Infrared and visible light fusion method

ABSTRACT

The present invention provides an infrared and visible light fusion method and belongs to the field of image processing and computer vision. The invention uses an infrared binocular camera and a visible light binocular camera to acquire images, involves the construction of a fusion image pyramid and a visual saliency enhancement algorithm, and is an infrared and visible light fusion algorithm based on multi-scale transform. The binocular cameras and an NVIDIA TX2 form a high-performance computing platform, on which a high-performance solving algorithm produces a high-quality infrared and visible light fusion image. The invention constructs an image pyramid by designing a filtering template according to the different imaging principles of infrared and visible light cameras, obtains image information at different scales, performs image super-resolution and saliency enhancement, and finally achieves real-time performance through GPU acceleration.

TECHNICAL FIELD

The present invention belongs to the field of image processing and computer vision. It uses an infrared camera and a visible light camera to acquire images, involves the construction of a fusion image pyramid and a visual saliency enhancement algorithm, and is an infrared and visible light fusion algorithm based on multi-scale transform.

BACKGROUND

Binocular stereo vision technology based on the visible light band is relatively mature. Visible light imaging offers rich contrast, color and shape information, so the matching information between binocular images can be obtained accurately and quickly in order to recover scene depth information. However, visible light band imaging has defects: its imaging quality is greatly reduced in strong light, fog, rain, snow or at night, which affects the matching precision. Therefore, establishing a color fusion system that uses the complementarity of different band information sources is an effective way to produce more credible images in special environments. For example, a visible light band binocular camera and an infrared band binocular camera can form a multi-band stereo vision system, in which infrared imaging, being unaffected by fog, rain, snow and lighting, makes up for the deficiency of visible light band imaging so as to obtain more complete and precise fusion information.

Image fusion is a promising research area in the field of image processing. Images formed by two different types of imaging sensors, or by similar sensors with different focal lengths and exposures, can be synthesized into a more informative image through image fusion technology, and the result is more suitable for later processing and research. These advantages have led to the wide development of image fusion in fields such as remote sensing, camera or mobile phone imaging, monitoring and reconnaissance; in particular, the fusion of infrared and visible light images plays a very important role in the military field. In recent years, most fusion methods have been researched and designed in the transform domain without considering the multi-scale detail information of images, resulting in the loss of details in the fused image, for example, the public patent CN208240087U [Chinese], an infrared and visible light fusion system and image fusion device. The traditional multi-scale transform is mainly composed of linear filters and easily produces halo artifacts during the decomposition process. Edge-preserving filters can avoid halo artifacts at edges while better retaining the edge characteristics of images, and obtain better results in image fusion, thus attracting more and more attention. Therefore, the present invention realizes the enhancement of details and the removal of artifacts on the basis of retaining the effective information of infrared and visible light images.

SUMMARY

To overcome the defects in the prior art, the present invention provides an infrared and visible light real-time fusion algorithm based on multi-scale pyramid transform, which constructs an image pyramid by designing a filtering template, obtains image information at different scales, performs image super-resolution and saliency enhancement, and finally achieves real-time performance through GPU acceleration.

The present invention has the following specific technical solution:

An infrared and visible light fusion method comprises the following steps:

1) Converting the color space of a visible light image from an RGB image to an HSV image, extracting the value information of the color image as the input of image fusion, and retaining the original hue and saturation;

2) Creating a filtering kernel template: constructing the corresponding filtering kernel template according to the sizes of infrared and visible light images; for example, when infrared and visible light images of 640×480 are input, the kernel template is as follows:

$\quad\begin{bmatrix} 0.015 & 0.031 & 0.062 & 0.031 & 0.015 \\ 0.031 & 0.062 & 0.125 & 0.062 & 0.031 \\ 0.062 & 0.125 & 0.25 & 0.125 & 0.062 \\ 0.031 & 0.062 & 0.125 & 0.062 & 0.031 \\ 0.015 & 0.031 & 0.062 & 0.031 & 0.015 \end{bmatrix}$

3) Constructing an image pyramid by using the filtering kernel template convolution;

4) Extracting the details of the pyramid at different scales by the linear interpolation method;

5) Distinguishing the details of infrared and visible light of each layer of the pyramid by using the saliency of the details, convolving the images by using the designed sliding window to generate a weight matrix, and comparing the neighborhood information of each pixel according to the weight matrix to distinguish and extract more credible detail images.

6) Linearly adding the infrared and visible light images of the smallest scale to obtain a smooth background image, fusing the extracted detail images into the background image, performing super-resolution upscaling on the image, then adding the detail information of the upper-layer scale, and iterating successively up to the top of the pyramid.

6-1) The super-resolution technology uses the cubic convolution interpolation to obtain richer details of a magnified image than the bilinear interpolation, the weight of each pixel value is determined by the distance from the pixel to the pixel to be determined, and the distance includes distances in horizontal and vertical directions.

7) Converting the color space: converting the fusion image back to the RGB image, and adding the hue and saturation previously retained.

8) Enhancing the color: Enhancing the color of the fusion image to generate a fusion image with higher resolution and contrast. Performing pixel-level image enhancement for the contrast of each pixel.
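The coarse-to-fine reconstruction described in step 6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the blending weight `alpha`, the function names and the ordering of `fused_details` are hypothetical, and pixel replication stands in for the cubic convolution upscaling of step 6-1) only to keep the sketch short.

```python
def upscale2x_placeholder(img):
    """Double width and height by pixel replication.

    Stand-in for the cubic convolution super-resolution step (assumption).
    """
    out = []
    for row in img:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def reconstruct(ir_coarse, vis_coarse, fused_details, alpha=0.5):
    """Rebuild the fused image from the smallest scale upward.

    fused_details[k] is the fused detail image of pyramid level k, with
    level 0 the finest; alpha is a hypothetical linear blending weight.
    """
    # Linear addition of the smallest-scale images gives the background.
    current = [[alpha * a + (1.0 - alpha) * b for a, b in zip(row_ir, row_vis)]
               for row_ir, row_vis in zip(ir_coarse, vis_coarse)]
    # Upscale, then add the detail of the next finer level, and repeat.
    for detail in reversed(fused_details):
        current = upscale2x_placeholder(current)
        current = [[c + d for c, d in zip(row_c, row_d)]
                   for row_c, row_d in zip(current, detail)]
    return current
```

With a 1×1 coarsest level and two detail levels, the result has the 4×4 size of the finest detail image.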

The present invention has the following beneficial effects: the present invention designs a real-time fusion method using infrared and visible light binocular stereo cameras. The present invention solves for the details of infrared and visible light by using the multi-scale pyramid transform, carries out the inverse pyramid transform with the super-resolution technology, constructs a highly credible fusion image, and has the following characteristics: (1) the system is easy to construct, and the input data can be acquired by using stereo binocular cameras; (2) the program is simple and easy to implement; (3) better image details are obtained by the pyramid multi-scale transform; (4) the loss of details in the inverse pyramid transform is effectively compensated by the super-resolution technology; (5) the structure is complete, multi-thread operation can be performed, and the program is robust; and (6) the detail images are used to perform saliency enhancement and differentiation to improve the generalization ability of the algorithm.

DESCRIPTION OF DRAWINGS

FIG. 1 shows infrared and visible light images acquired by an acquisition platform and a schematic diagram of pyramid multi-scale decomposition.

FIG. 2 is a step chart of overall image acquisition and fusion.

FIG. 3 is a flow chart of the present invention.

FIG. 4 is a final fusion image of the present invention.

DETAILED DESCRIPTION

The present invention proposes a method for real-time image fusion by an infrared camera and a visible light camera, and will be described in detail below in combination with drawings and embodiments.

A binocular stereo camera is placed on a fixed platform. In the embodiment, the image resolution of the camera is 1280×720 and the field angle is 45.4°. The experimental platform is shown in FIG. 1, and an NVIDIA TX2 is used for calculation to ensure timeliness. On this basis, an infrared and visible light fusion method is provided, and the method comprises the following steps:

1) Obtaining registered infrared and visible light images

1-1) Respectively calibrating each lens of the visible light binocular camera and the infrared binocular camera and jointly calibrating the respective systems;

1-2) Respectively calibrating the infrared binocular camera and the visible light binocular camera by the Zhang Zhengyou calibration method to obtain internal parameters such as focal length and principal point position and external parameters such as rotation and translation of each camera;

1-3) Calculating the positional relationship of the same plane in the visible light image and the infrared image by using the external parameter RT obtained by the joint calibration method and the detected checker corners, and registering the visible light image to the infrared image by using a homography matrix.
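To illustrate how a homography matrix registers a visible light pixel to the infrared image in step 1-3), the sketch below maps a single point through a 3×3 matrix. The function name and the example matrix are hypothetical; in practice the matrix would come from the joint calibration, which is not reproduced here.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xs = H[0][0] * x + H[0][1] * y + H[0][2]
    ys = H[1][0] * x + H[1][1] * y + H[1][2]
    ws = H[2][0] * x + H[2][1] * y + H[2][2]
    if ws == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return xs / ws, ys / ws


# Hypothetical example: a pure translation by (12, -5) pixels.
H_example = [[1.0, 0.0, 12.0],
             [0.0, 1.0, -5.0],
             [0.0, 0.0, 1.0]]
registered = apply_homography(H_example, 100.0, 200.0)
```

Warping a whole image applies the same mapping (usually its inverse, with interpolation) to every pixel.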

2) Performing multi-scale pyramid transform on the images, and using the designed filtering template to respectively perform down-convolution and down-sampling on the infrared image and the visible light image, as shown in FIG. 1, wherein the filtering template acquisition mode is shown in the formula below:

$\begin{matrix} {{h\left( {x,y} \right)} = e^{- \frac{x^{2} + y^{2}}{2\sigma^{2}}}} & (1) \end{matrix}$

wherein x and y are the distances, in the horizontal and vertical directions respectively, between another pixel in the neighborhood and the center pixel; and σ is the standard deviation parameter.
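Formula (1) can be sampled on a small grid to obtain a filtering template. The sketch below is illustrative only: the kernel size, the value of sigma and the normalization to unit sum are assumptions that the text does not specify.

```python
import math


def gaussian_kernel(size=5, sigma=1.0, normalize=True):
    """Sample h(x, y) = exp(-(x^2 + y^2) / (2 * sigma^2)) on a size x size grid."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that the kernel has a center pixel")
    r = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-r, r + 1)]
              for y in range(-r, r + 1)]
    if normalize:
        # Assumption: scale the weights so that they sum to 1.
        total = sum(sum(row) for row in kernel)
        kernel = [[value / total for value in row] for row in kernel]
    return kernel


kernel = gaussian_kernel(5, 1.0)
```

The center weight is the largest and the weights fall off symmetrically with distance, which matches the general shape of the 5×5 template given in the summary.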

3) Extracting the details of the infrared and visible light images based on multi-scale pyramid transform, and using the high frequency of the images obtained by the linear interpolation method as the detail layer of fusion.
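Steps 2) and 3) can be sketched together as follows. This is one possible reading rather than the method of the invention: the detail layer of each level is taken as the level minus a linearly interpolated upscaling of the next coarser level. The 3×3 binomial kernel, the replicated borders and all function names are assumptions of this sketch.

```python
def convolve(img, kernel):
    """2-D convolution with replicated borders (border handling is an assumption)."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += kernel[di + r][dj + r] * img[ii][jj]
            out[i][j] = acc
    return out


def downsample(img):
    """Keep every second row and column."""
    return [row[::2] for row in img[::2]]


def upsample_linear(img, out_h, out_w):
    """Bilinear interpolation back to out_h x out_w."""
    h, w = len(img), len(img[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        y = min(i / 2.0, h - 1)
        i0 = int(y)
        i1 = min(i0 + 1, h - 1)
        fy = y - i0
        for j in range(out_w):
            x = min(j / 2.0, w - 1)
            j0 = int(x)
            j1 = min(j0 + 1, w - 1)
            fx = x - j0
            top = img[i0][j0] * (1 - fx) + img[i0][j1] * fx
            bottom = img[i1][j0] * (1 - fx) + img[i1][j1] * fx
            out[i][j] = top * (1 - fy) + bottom * fy
    return out


def build_pyramid(img, kernel, levels):
    """Return (smoothed levels, detail layers); index 0 is the finest scale."""
    smooth = [img]
    for _ in range(levels):
        smooth.append(downsample(convolve(smooth[-1], kernel)))
    details = []
    for k in range(levels):
        fine = smooth[k]
        up = upsample_linear(smooth[k + 1], len(fine), len(fine[0]))
        details.append([[a - b for a, b in zip(row_a, row_b)]
                        for row_a, row_b in zip(fine, up)])
    return smooth, details


# Hypothetical normalized 3x3 binomial kernel used only for this example.
BINOMIAL_3X3 = [[1 / 16, 2 / 16, 1 / 16],
                [2 / 16, 4 / 16, 2 / 16],
                [1 / 16, 2 / 16, 1 / 16]]
```

Applying `build_pyramid` separately to the infrared image and to the V channel of the visible light image yields the per-level detail layers that the later fusion step compares.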

The following is the main flow of the algorithm as shown in FIG. 2, and the specific description is as follows:

4) Converting the color space of the image

4-1) In view of the problem that the visible light image has RGB three channels, converting the RGB color space to the HSV color space, extracting the V (value) information of the visible light image to be fused with the infrared image, and retaining H (hue) and S (saturation), wherein the specific conversion is shown as follows:

$\begin{matrix} R' = \frac{R}{255} & (2) \\ G' = \frac{G}{255} & (3) \\ B' = \frac{B}{255} & (4) \\ C_{\max} = \max(R', G', B') & (5) \\ C_{\min} = \min(R', G', B') & (6) \\ \Delta = C_{\max} - C_{\min} & (7) \\ V = C_{\max} & (8) \end{matrix}$

wherein R is a red channel, G is a green channel, and B is a blue channel; R′ is the red channel after color space conversion, G′ is the green channel after color space conversion, and B′ is the blue channel after color space conversion; Cmax represents the maximum value among R′, G′, B′; Cmin represents the minimum value among R′, G′, B′; and Δ represents the difference between the maximum value and the minimum value among R′, G′, B′;

4-2) Extracting the V (value) channel as the input of visible light, retaining the hue H and saturation S to the corresponding matrix, and retaining the color information for the subsequent color restoration after fusion.
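The channel split of step 4) can be sketched in a few lines. V is computed directly from formulas (2) to (8); because the text gives no formulas for H and S, this sketch borrows them from Python's standard `colorsys` module, which is an assumption. The function name `split_hsv` is hypothetical.

```python
import colorsys


def split_hsv(rgb_image):
    """Split rows of (R, G, B) pixels in 0..255 into H, S and V planes.

    H is returned in degrees, S and V in the range 0..1.
    """
    hue, sat, val = [], [], []
    for row in rgb_image:
        h_row, s_row, v_row = [], [], []
        for r, g, b in row:
            rp, gp, bp = r / 255.0, g / 255.0, b / 255.0   # formulas (2)-(4)
            c_max = max(rp, gp, bp)                        # formula (5)
            h, s, _ = colorsys.rgb_to_hsv(rp, gp, bp)      # assumption: H, S via colorsys
            h_row.append(h * 360.0)
            s_row.append(s)
            v_row.append(c_max)                            # formula (8): V = Cmax
        hue.append(h_row)
        sat.append(s_row)
        val.append(v_row)
    return hue, sat, val
```

Only the V plane is passed on to the fusion steps; the H and S planes are kept unchanged for the color restoration in step 7).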

5) Convolving details by filtering, and filtering infrared and visible light detail images

5-1) Designing two 3×3 empty matrixes, starting convolution sequentially from the starting pixels of the two images, distinguishing eight neighborhood pixels of the corresponding points in the visible light and infrared detail images, distinguishing the saliency stationary points of the corresponding neighborhood pixels, taking 1 for large ones and 0 for small ones, and respectively saving in the corresponding matrixes; and updating sequentially till the last pixel of the image;

5-2) According to the weight of the generated matrix, fusing the detail images of the infrared and visible light images to generate a detail image with richer texture.
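One way to read step 5) is sketched below: for every pixel, the 3×3 neighborhoods of the infrared and visible light detail images are compared, the more salient source receives weight 1 and the other weight 0, and the binary weight matrix then selects the fused detail. The saliency measure used here (sum of absolute detail values in the neighborhood), the tie-breaking toward visible light and the function names are assumptions, since the text does not define them precisely.

```python
def local_saliency(detail):
    """Sum of absolute detail values over each 3x3 neighborhood (assumed measure)."""
    h, w = len(detail), len(detail[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)   # replicate borders
                    jj = min(max(j + dj, 0), w - 1)
                    acc += abs(detail[ii][jj])
            out[i][j] = acc
    return out


def fuse_details(ir_detail, vis_detail):
    """Return (fused detail, weight matrix); weight 1 selects infrared."""
    s_ir = local_saliency(ir_detail)
    s_vis = local_saliency(vis_detail)
    weights = [[1 if a > b else 0 for a, b in zip(row_a, row_b)]
               for row_a, row_b in zip(s_ir, s_vis)]
    fused = [[w * a + (1 - w) * b for w, a, b in zip(row_w, row_a, row_b)]
             for row_w, row_a, row_b in zip(weights, ir_detail, vis_detail)]
    return fused, weights
```

The same routine would be applied independently at every pyramid level.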

6) Performing inverse multi-scale pyramid transform by using image super-resolution

6-1) Selecting the cubic convolution interpolation super-resolution algorithm for inverse multi-scale pyramid transform; from the deepest down-sampled sub-image, after fusing the detail images, expanding the image to the second deepest sub-image by super-resolution, and iterating successively until restoring to the original image size; with a pixel as an example, the distances between the pixel and the pixel to be determined in the vertical and horizontal directions are respectively 1+u and v, the weight of the pixel is w=w(1+u)×w(v), and then the pixel value f(i+u,j+v) of the pixel to be determined is calculated as follows:

f(i+u,j+v)=A×Q×P  (9)

wherein A, Q and P are matrixes generated by the distances:

$A = \begin{bmatrix} w(1+u) & w(u) & w(1-u) & w(2-u) \end{bmatrix}$;

$P = \begin{bmatrix} w(1+v) & w(v) & w(1-v) & w(2-v) \end{bmatrix}^{T}$;

$Q = \begin{bmatrix} f(i-1,j-1) & f(i-1,j) & f(i-1,j+1) & f(i-1,j+2) \\ f(i,j-1) & f(i,j) & f(i,j+1) & f(i,j+2) \\ f(i+1,j-1) & f(i+1,j) & f(i+1,j+1) & f(i+1,j+2) \\ f(i+2,j-1) & f(i+2,j) & f(i+2,j+1) & f(i+2,j+2) \end{bmatrix}$.

The interpolation kernel w(x) is:

$w(x) = \begin{cases} 1 - 2|x|^{2} + |x|^{3}, & |x| < 1 \\ 4 - 8|x| + 5|x|^{2} - |x|^{3}, & 1 \leq |x| < 2 \\ 0, & |x| \geq 2 \end{cases} \quad (10)$

Finally, according to the weight and value of the pixel, calculating the pixel value of the corresponding position of the pixel after super-resolution;

6-2) Saving the super-resolved fusion image in a newly established zero matrix to prepare for the next step.
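Formulas (9) and (10) can be checked with a short sketch. The code below evaluates the interpolation kernel w(x) and the product A×Q×P over the 4×4 neighborhood of pixel (i, j). The border clamping, the factor-of-two upscaling grid and the function names are assumptions made for illustration.

```python
def w(x):
    """Interpolation kernel of formula (10)."""
    x = abs(x)
    if x < 1:
        return 1 - 2 * x ** 2 + x ** 3
    if x < 2:
        return 4 - 8 * x + 5 * x ** 2 - x ** 3
    return 0.0


def cubic_interpolate(img, i, j, u, v):
    """Evaluate f(i+u, j+v) = A x Q x P, with 0 <= u, v < 1."""
    h, wd = len(img), len(img[0])
    A = [w(1 + u), w(u), w(1 - u), w(2 - u)]   # weights for rows i-1 .. i+2
    P = [w(1 + v), w(v), w(1 - v), w(2 - v)]   # weights for columns j-1 .. j+2
    value = 0.0
    for a, di in zip(A, (-1, 0, 1, 2)):
        for p, dj in zip(P, (-1, 0, 1, 2)):
            ii = min(max(i + di, 0), h - 1)    # assumption: clamp at the borders
            jj = min(max(j + dj, 0), wd - 1)
            value += a * img[ii][jj] * p
    return value


def upscale2x(img):
    """Double the image size by sampling at half-pixel steps."""
    h, wd = len(img), len(img[0])
    return [[cubic_interpolate(img, r // 2, c // 2, (r % 2) / 2.0, (c % 2) / 2.0)
             for c in range(2 * wd)]
            for r in range(2 * h)]
```

For any u the four weights in A sum to 1, so a constant image stays constant after upscaling, and for u = v = 0 the formula returns the original pixel value.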

7) Performing restoration and color enhancement on the color space

7-1) Restoring to the RGB color space from the HSV color space by saving the super-resolved fusion image into the V (value) channel for updating in combination with the previously retained H (hue) and S (saturation), wherein the specific update formulas are shown as follows:

$\begin{matrix} C = V \times S & (11) \\ X = C \times \left(1 - \left|\left(H / 60^{\circ}\right) \bmod 2 - 1\right|\right) & (12) \\ m = V - C & (13) \\ (R', G', B') = \begin{cases} (C, X, 0), & 0^{\circ} \leq H < 60^{\circ} \\ (X, C, 0), & 60^{\circ} \leq H < 120^{\circ} \\ (0, C, X), & 120^{\circ} \leq H < 180^{\circ} \\ (0, X, C), & 180^{\circ} \leq H < 240^{\circ} \\ (X, 0, C), & 240^{\circ} \leq H < 300^{\circ} \\ (C, 0, X), & 300^{\circ} \leq H < 360^{\circ} \end{cases} & (14) \\ (R, G, B) = \left((R' + m) \times 255,\ (G' + m) \times 255,\ (B' + m) \times 255\right) & (15) \end{matrix}$

wherein C is the product of the value and the saturation; and m is the difference between the value and C.
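Formulas (11) to (15) translate directly into code. The sketch below assumes H in degrees and S and V in the range 0 to 1; rounding to integer channel values is an assumption, and the function name is hypothetical.

```python
def hsv_to_rgb255(h_deg, s, v):
    """Convert one HSV pixel to 8-bit RGB following formulas (11)-(15)."""
    c = v * s                                        # formula (11)
    x = c * (1 - abs((h_deg / 60.0) % 2 - 1))        # formula (12)
    m = v - c                                        # formula (13)
    sector = int(h_deg // 60) % 6                    # 60-degree sector of H
    rp, gp, bp = [(c, x, 0), (x, c, 0), (0, c, x),
                  (0, x, c), (x, 0, c), (c, 0, x)][sector]   # formula (14)
    # Formula (15); rounding to integers is an assumption of this sketch.
    return tuple(round((channel + m) * 255) for channel in (rp, gp, bp))
```

In the fusion pipeline, `v` would be the super-resolved fused value at a pixel, while `h_deg` and `s` are the hue and saturation retained in step 4-2).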

7-2) Performing color correction and enhancement on the image restored in step 7-1) to generate a three-channel image that is more consistent with observation and detection; and performing color enhancement on the R channel, G channel and B channel respectively, as shown in the formulas below:

$\begin{matrix} R_{out} = \left(R_{in}\right)^{1/\mathrm{gamma}} & (16) \\ R_{display} = \left(R_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (17) \\ G_{out} = \left(G_{in}\right)^{1/\mathrm{gamma}} & (18) \\ G_{display} = \left(G_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (19) \\ B_{out} = \left(B_{in}\right)^{1/\mathrm{gamma}} & (20) \\ B_{display} = \left(B_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (21) \end{matrix}$

wherein gamma is a brightness enhancement parameter; $R_{out}$ is the inverse transform of the red channel after gamma correction; $R_{in}$ is the value of the initial red channel; $R_{display}$ is the value of the R channel after gamma correction; $G_{display}$ is the value of the G channel after gamma correction; and $B_{display}$ is the numerical compensation value of the B channel after gamma correction. The generated image is shown in FIG. 4.
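The per-channel power law of formulas (16) to (21) can be sketched as follows. The normalization of channel values to the range 0 to 1 before exponentiation and the default gamma of 2.2 are assumptions; the text does not state either, and the function names are hypothetical.

```python
def gamma_encode(value, gamma=2.2):
    """out = in ** (1 / gamma), as in formulas (16), (18), (20), on a 0..255 scale."""
    if not 0 <= value <= 255:
        raise ValueError("channel value must lie in 0..255")
    return 255.0 * (value / 255.0) ** (1.0 / gamma)


def gamma_decode(value, gamma=2.2):
    """Raise to the power gamma, the outer exponent in formulas (17), (19), (21)."""
    return 255.0 * (value / 255.0) ** gamma


def enhance_pixel(rgb, gamma=2.2):
    """Apply the encoding to the R, G and B channels independently."""
    return tuple(round(gamma_encode(channel, gamma)) for channel in rgb)
```

With gamma greater than 1 the encoding brightens mid-tones while leaving 0 and 255 fixed; decoding an encoded value returns the input, which is what the nested exponents in formulas (17), (19) and (21) express.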

CLAIMS

1. An infrared and visible light fusion method, wherein the fusion method comprises the following steps: 1) obtaining registered infrared and visible light images; 2) performing multi-scale pyramid transform on the images, and using the designed filtering template to respectively perform down-convolution and down-sampling on the infrared image and the visible light image, wherein the filtering template acquisition mode is: $h(x,y) = e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}} \quad (1)$ wherein x and y are the distances, in the horizontal and vertical directions respectively, between another pixel in the neighborhood and the center pixel; and σ is the standard deviation parameter; 3) extracting the details of the infrared and visible light images based on multi-scale pyramid transform, and using the high frequency of the images obtained by the linear interpolation method as the detail layer of fusion; 4) converting the color space of the image; 5) convolving details by filtering, and filtering infrared and visible light detail images; 6) performing inverse multi-scale pyramid transform by using image super-resolution; 7) performing restoration and color enhancement on the color space.
2. The infrared and visible light fusion method according to claim 1, wherein step 1) comprises the following specific steps: 1-1) respectively calibrating each lens of the visible light binocular camera and the infrared binocular camera and jointly calibrating the respective systems; 1-2) respectively calibrating the infrared binocular camera and the visible light binocular camera by the Zhang Zhengyou calibration method to obtain internal parameters and external parameters of each camera, wherein the internal parameters include focal length and principal point position, and the external parameters include rotation and translation; 1-3) calculating the positional relationship of the same plane in the visible light image and the infrared image by using the external parameter RT obtained by the joint calibration method and the detected checker corners, and registering the visible light image to the infrared image by using a homography matrix.
3. The infrared and visible light fusion method according to claim 1, wherein step 4) comprises the following specific steps: 4-1) when the visible light image has RGB three channels, converting the RGB color space to the HSV color space, wherein the specific conversion is shown as follows: $\begin{matrix} R' = \frac{R}{255} & (2) \\ G' = \frac{G}{255} & (3) \\ B' = \frac{B}{255} & (4) \\ C_{\max} = \max(R', G', B') & (5) \\ C_{\min} = \min(R', G', B') & (6) \\ \Delta = C_{\max} - C_{\min} & (7) \\ V = C_{\max} & (8) \end{matrix}$ wherein R is a red channel, G is a green channel, and B is a blue channel; R′ is the red channel after color space conversion, G′ is the green channel after color space conversion, and B′ is the blue channel after color space conversion; Cmax represents the maximum value among R′, G′, B′; Cmin represents the minimum value among R′, G′, B′; and Δ represents the difference between the maximum value and the minimum value among R′, G′, B′; 4-2) extracting the value information V as the input of visible light, retaining the hue H and saturation S to the corresponding matrix, and retaining the color information for the subsequent color restoration after fusion.
 4. The infrared and visible light fusion method according to claim 1, wherein step 5) comprises the following specific steps: 5-1) designing two 3×3 empty matrixes, starting convolution sequentially from the starting pixels of the two images, distinguishing eight neighborhood pixels of the corresponding points in the visible light and infrared detail images, distinguishing the saliency stationary points of the corresponding neighborhood pixels, taking 1 for large ones and 0 for small ones, and respectively saving in the corresponding matrixes; and updating sequentially till the last pixel of the image; 5-2) according to the weight of the generated matrix, fusing the detail images of the infrared and visible light images to generate a detail image with rich texture.
 5. The infrared and visible light fusion method according to claim 3, wherein step 5) comprises the following specific steps: 5-1) designing two 3×3 empty matrixes, starting convolution sequentially from the starting pixels of the two images, distinguishing eight neighborhood pixels of the corresponding points in the visible light and infrared detail images, distinguishing the saliency stationary points of the corresponding neighborhood pixels, taking 1 for large ones and 0 for small ones, and respectively saving in the corresponding matrixes; and updating sequentially till the last pixel of the image; 5-2) according to the weight of the generated matrix, fusing the detail images of the infrared and visible light images to generate a detail image with rich texture.
6. The infrared and visible light fusion method according to claim 1, wherein step 6) comprises the following specific steps: 6-1) selecting the cubic convolution interpolation super-resolution algorithm for inverse multi-scale pyramid transform; from the deepest down-sampled sub-image, after fusing the detail images, expanding the image to the second deepest sub-image by super-resolution, and iterating successively until restoring to the original image size; in the case of a pixel, the distances between the pixel and the pixel to be determined in the vertical and horizontal directions are respectively 1+u and v, the weight of the pixel is w=w(1+u)×w(v), and then the pixel value f(i+u,j+v) of the pixel to be determined is calculated as follows: f(i+u,j+v)=A×Q×P  (9) wherein A, Q and P are matrixes generated by the distances; $A = \begin{bmatrix} w(1+u) & w(u) & w(1-u) & w(2-u) \end{bmatrix}$; $P = \begin{bmatrix} w(1+v) & w(v) & w(1-v) & w(2-v) \end{bmatrix}^{T}$; $Q = \begin{bmatrix} f(i-1,j-1) & f(i-1,j) & f(i-1,j+1) & f(i-1,j+2) \\ f(i,j-1) & f(i,j) & f(i,j+1) & f(i,j+2) \\ f(i+1,j-1) & f(i+1,j) & f(i+1,j+1) & f(i+1,j+2) \\ f(i+2,j-1) & f(i+2,j) & f(i+2,j+1) & f(i+2,j+2) \end{bmatrix}$; the interpolation kernel w(x) is: $w(x) = \begin{cases} 1 - 2|x|^{2} + |x|^{3}, & |x| < 1 \\ 4 - 8|x| + 5|x|^{2} - |x|^{3}, & 1 \leq |x| < 2 \\ 0, & |x| \geq 2 \end{cases} \quad (10)$ finally, according to the weight and value of the pixel, calculating the pixel value of the corresponding position of the pixel after super-resolution; 6-2) saving the super-resolved fusion image in a newly established zero matrix.
7. The infrared and visible light fusion method according to claim 4, wherein step 6) comprises the following specific steps: 6-1) selecting the cubic convolution interpolation super-resolution algorithm for inverse multi-scale pyramid transform; from the deepest down-sampled sub-image, after fusing the detail images, expanding the image to the second deepest sub-image by super-resolution, and iterating successively until restoring to the original image size; in the case of a pixel, the distances between the pixel and the pixel to be determined in the vertical and horizontal directions are respectively 1+u and v, the weight of the pixel is w=w(1+u)×w(v), and then the pixel value f(i+u,j+v) of the pixel to be determined is calculated as follows: f(i+u,j+v)=A×Q×P  (9) wherein A, Q and P are matrixes generated by the distances; $A = \begin{bmatrix} w(1+u) & w(u) & w(1-u) & w(2-u) \end{bmatrix}$; $P = \begin{bmatrix} w(1+v) & w(v) & w(1-v) & w(2-v) \end{bmatrix}^{T}$; $Q = \begin{bmatrix} f(i-1,j-1) & f(i-1,j) & f(i-1,j+1) & f(i-1,j+2) \\ f(i,j-1) & f(i,j) & f(i,j+1) & f(i,j+2) \\ f(i+1,j-1) & f(i+1,j) & f(i+1,j+1) & f(i+1,j+2) \\ f(i+2,j-1) & f(i+2,j) & f(i+2,j+1) & f(i+2,j+2) \end{bmatrix}$; the interpolation kernel w(x) is: $w(x) = \begin{cases} 1 - 2|x|^{2} + |x|^{3}, & |x| < 1 \\ 4 - 8|x| + 5|x|^{2} - |x|^{3}, & 1 \leq |x| < 2 \\ 0, & |x| \geq 2 \end{cases} \quad (10)$ finally, according to the weight and value of the pixel, calculating the pixel value of the corresponding position of the pixel after super-resolution; 6-2) saving the super-resolved fusion image in a newly established zero matrix.
8. The infrared and visible light fusion method according to claim 1, wherein step 7) comprises the following specific steps: 7-1) restoring to the RGB color space from the HSV color space by saving the super-resolved fusion image into the value information V for updating in combination with the previously retained H (hue) and S (saturation), wherein the specific formulas are shown as follows: $\begin{matrix} C = V \times S & (11) \\ X = C \times \left(1 - \left|\left(H / 60^{\circ}\right) \bmod 2 - 1\right|\right) & (12) \\ m = V - C & (13) \\ (R', G', B') = \begin{cases} (C, X, 0), & 0^{\circ} \leq H < 60^{\circ} \\ (X, C, 0), & 60^{\circ} \leq H < 120^{\circ} \\ (0, C, X), & 120^{\circ} \leq H < 180^{\circ} \\ (0, X, C), & 180^{\circ} \leq H < 240^{\circ} \\ (X, 0, C), & 240^{\circ} \leq H < 300^{\circ} \\ (C, 0, X), & 300^{\circ} \leq H < 360^{\circ} \end{cases} & (14) \\ (R, G, B) = \left((R' + m) \times 255,\ (G' + m) \times 255,\ (B' + m) \times 255\right) & (15) \end{matrix}$ wherein C is the product of the value and the saturation; and m is the difference between the value and C; 7-2) performing color correction and enhancement on the image restored in step 7-1) to generate a three-channel image that is consistent with observation and detection; and performing color enhancement on the R channel, G channel and B channel respectively, wherein the specific formulas are shown as follows: $\begin{matrix} R_{out} = \left(R_{in}\right)^{1/\mathrm{gamma}} & (16) \\ R_{display} = \left(R_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (17) \\ G_{out} = \left(G_{in}\right)^{1/\mathrm{gamma}} & (18) \\ G_{display} = \left(G_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (19) \\ B_{out} = \left(B_{in}\right)^{1/\mathrm{gamma}} & (20) \\ B_{display} = \left(B_{in}^{(1/\mathrm{gamma})}\right)^{\mathrm{gamma}} & (21) \end{matrix}$ wherein gamma is a brightness enhancement parameter; $R_{out}$ is the inverse transform of the red channel after gamma correction; $R_{in}$ is the value of the initial red channel; $R_{display}$ is the value of the R channel after gamma correction; $G_{display}$ is the value of the G channel after gamma correction; and $B_{display}$ is the numerical compensation value of the B channel after gamma correction.